GPU-Accelerated Asynchronous Error Correction for Mixed Precision Iterative Refinement
Authors
Abstract
In hardware-aware high-performance computing, block-asynchronous iteration and mixed precision iterative refinement are two techniques used to leverage the computing power of SIMD accelerators such as GPUs. Although they take very different approaches, they share the basic idea of compensating for the weaker convergence behaviour of a numerically inferior algorithm through more efficient use of the available computing power. In this paper, we analyze the potential of combining both techniques. To this end, we implement a mixed precision iterative refinement algorithm that uses a block-asynchronous iteration as the error correction solver, and compare its performance with that of a pure block-asynchronous iteration and of an iterative refinement method that uses double precision for the error correction solver. For matrices from the University of Florida Matrix Collection, we report the convergence behaviour and the total solver runtime on different GPU architectures.
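The combination described in the abstract can be illustrated with a minimal sketch. It is not the paper's GPU implementation: it assumes a small dense, diagonally dominant system, emulates single precision on the CPU by rounding through `struct`, and substitutes a plain synchronous Jacobi sweep for the block-asynchronous error correction solver. The names `to_f32`, `jacobi_f32`, and `refine` are hypothetical helpers introduced only for this illustration.

```python
import struct


def to_f32(x):
    """Round a Python float (double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def jacobi_f32(A, r, sweeps):
    """Approximate the correction c in A c = r with a few Jacobi sweeps.

    Assumption: a synchronous Jacobi sweep stands in for the block-asynchronous
    iteration; inputs and every updated entry are rounded to single precision.
    """
    n = len(r)
    A32 = [[to_f32(v) for v in row] for row in A]
    r32 = [to_f32(v) for v in r]
    c = [0.0] * n
    for _ in range(sweeps):
        c = [
            to_f32(
                (r32[i] - sum(A32[i][j] * c[j] for j in range(n) if j != i))
                / A32[i][i]
            )
            for i in range(n)
        ]
    return c


def refine(A, b, tol=1e-12, max_outer=50, sweeps=5):
    """Mixed precision iterative refinement for A x = b.

    The residual and the solution update use double precision; only the
    error correction solve runs in (emulated) single precision.
    Returns the solution and the number of outer iterations performed.
    """
    n = len(b)
    x = [0.0] * n
    for k in range(max_outer):
        # Residual in double precision.
        r = [b[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        if max(abs(v) for v in r) <= tol:
            return x, k
        # Low-precision error correction, then double-precision update.
        c = jacobi_f32(A, r, sweeps)
        x = [x[i] + c[i] for i in range(n)]
    return x, max_outer


if __name__ == "__main__":
    A = [[4.0, 1.0, 0.0], [1.0, 4.0, 1.0], [0.0, 1.0, 4.0]]
    b = [5.0, 6.0, 5.0]  # exact solution is (1, 1, 1)
    x, outer = refine(A, b)
    print(x, outer)
```

The outer loop is what restores double-precision accuracy: each low-precision correction only needs to reduce the error by some factor, and the residual recomputed in double precision exposes whatever error remains.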
Similar resources
Unleashing CPU-GPU Acceleration for Control Theory Applications
In this paper we review the effect of two high-performance techniques for the solution of matrix equations arising in control theory applications on CPU-GPU platforms, in particular advanced optimization via look-ahead and iterative refinement. Our experimental evaluation on the last GPU-generation from NVIDIA, “Kepler”, shows the slight advantage of matrix inversion via Gauss-Jordan eliminatio...
Mixed Precision Dense Linear System Solvers for High Performance Reconfigurable Computing
The iterative refinement method for linear system solvers can improve performance while maintaining numeric accuracy. Previous work addressing iterative refinement exploits single precision and double precision for CPU, GPU, or Cell/BE processors. Due to only two different precisions supported, iterative refinement is limited on those platforms. Reconfigurable Computing (RC) is a great candidat...
Solving dense symmetric indefinite systems using GPUs
This paper studies the performance of different algorithms for solving a dense symmetric indefinite linear system of equations on multicore CPUs with a Graphics Processing Unit (GPU). To ensure the numerical stability of the factorization, pivoting is required. Obtaining high performance of such algorithms on the GPU is difficult because all the existing pivoting strategies lead to frequent syn...
Accelerating the Solution of Linear Systems by Iterative Refinement in Three Precisions
We propose a general algorithm for solving an n×n nonsingular linear system Ax = b based on iterative refinement with three precisions. The working precision is combined with possibly different precisions for solving for the correction term and for computing the residuals. Via rounding error analysis of the algorithm we derive sufficient conditions for convergence and bounds for the attainable n...
Exploiting the capabilities of modern GPUs for dense matrix computations
We present several algorithms to compute the solution of a linear system of equations on a GPU, as well as general techniques to improve their performance, such as padding and hybrid GPU-CPU computation. We compare single and double precision performance of a modern GPU with unified architecture, and show how iterative refinement with mixed precision can be used to regain full accuracy in the s...
Journal title:
Volume, issue:
Pages: -
Publication date: 2012